
14 research outputs found

Handling non-compositionality in multilingual CNLs

Author: Enache Ramona
Kolachina Prasanth
Listenmaa Inari
Publication date: 01/01/2014

In this paper, we describe methods for handling multilingual non-compositional constructions in the framework of GF. We specifically look at methods to detect and extract non-compositional phrases from parallel texts and propose methods to handle such constructions in GF grammars. We expect that the methods to handle non-compositional constructions will enrich CNLs by providing more flexibility in the design of controlled languages. We look at two specific use cases of non-compositional constructions: a general-purpose method to detect and extract multilingual multiword expressions and a procedure to identify nominal compounds in German. We evaluate our procedure for multiword expressions by performing a qualitative analysis of the results. For the experiments on nominal compounds, we incorporate the detected compounds in a full SMT pipeline and evaluate the impact of our method on the machine translation process. Comment: CNL workshop in COLING 201

arXiv.org e-Print Archive

Crossref

Automatic conversion of colloquial Finnish to standard Finnish

Author: Francis M Tyers
Inari Listenmaa
Publication date: 24/04/2020

This paper presents a rule-based method for converting between colloquial Finnish and standard Finnish. The method relies upon a small number of orthographical rules combined with a large language model of standard Finnish for ranking the possible conversions. Aside from this contribution, the paper also presents an evaluation corpus consisting of aligned sentences in colloquial Finnish, orthographically-standardised colloquial Finnish and standard Finnish. The method we present outperforms the baseline of simply treating colloquial Finnish as standard Finnish, but is outperformed by a phrase-based MT system trained on the evaluation corpus. The paper also presents preliminary results which show promise for using normalisation in the machine translation task.

CiteSeerX

An End-to-End Pipeline from Law Text to Logical Formulas

Author: Listenmaa Inari
Ranta Aarne
Soh Jerrold
Wong Meng Weng
Publication venue: IOS Press
Publication date: 01/01/2022

We propose a pipeline for converting natural English law texts into logical formulas via a series of structural representations. Texts are first parsed using a formal grammar derived from light-weight annotations. An intermediate representation called assembly logic is then used for logical interpretation and supports translations to different back-end logics and visualisations. The approach, while rule-based and explainable, is also robust: it can deliver useful results from day one, but allows subsequent refinements and variations.

Institutional Knowledge at Singapore Management University

Chalmers Research

Constraint Grammar as a SAT problem

Author: Lindström Claessen Koen
Listenmaa Inari
Publication date: 01/01/2015

We represent Constraint Grammar (CG) as a Boolean satisfiability (SAT) problem. Encoding CG in logic brings some new features to the grammars. The rules are interpreted in a more declarative way, which makes it possible to abstract away from details such as cautious context and ordering. A rule is allowed to affect its context words, which makes the number of rules in a grammar potentially smaller. Ordering can be preserved or discarded; in the latter case, we solve possible rule conflicts by finding a solution that discards the fewest rule applications. We test our implementation by parsing texts on the order of tens to hundreds of thousands of words, using grammars with hundreds of rules.

Chalmers Research

Automatic test suite generation for PMCFG grammars

Author: Lindström Claessen Koen
Listenmaa Inari
Publication venue: EasyChair
Publication date: 01/01/2018

We present a method for finding errors in formalized natural language grammars, by automatically and systematically generating test cases that are intended to be judged by a human oracle. The method works on a per-construction basis; given a construction from the grammar, it generates a finite but complete set of test sentences (typically tens or hundreds), where that construction is used in all possible ways. Our method is an alternative to using a corpus or a treebank, where no such completeness guarantees can be made. The method is language-independent and is implemented for the grammar formalism PMCFG, but also works for weaker grammar formalisms. We evaluate the method on a number of different grammars for different natural languages, with sizes ranging from toy examples to real-world grammars.

Chalmers Research

Formal Methods for Testing Grammars

Author: Listenmaa Inari
Publication date: 15/02/2019

Grammar engineering has a lot in common with software engineering. Analogous to a program specification, we use descriptive grammar books; in place of unit tests, we have gold standard corpora and test cases for manual inspection. And just like any software, our grammars still contain bugs: grammatical sentences that are rejected, ungrammatical sentences that are parsed, or grammatical sentences that get the wrong parse. This thesis presents two contributions to the analysis and quality control of computational grammars of natural languages. Firstly, we present a method for finding contradictory grammar rules in Constraint Grammar, a robust and low-level formalism for part-of-speech tagging and shallow parsing. Secondly, we generate minimal and representative test suites of example sentences that cover all grammatical constructions in Grammatical Framework, a multilingual grammar formalism based on deep structural analysis.

Göteborgs universitets publikationer - e-publicering och e-arkiv

Analysing constraint grammars with a SAT-solver

Author: Claessen Koen
Listenmaa Inari
Publication date: 01/01/2016

We describe a method for analysing Constraint Grammars (CG) that can detect internal conflicts and redundancies in a given grammar, without the need for a corpus. The aim is for grammar writers to be able to automatically diagnose and then manually improve their grammars. Our method works by translating the given grammar into logical constraints that are analysed by a SAT-solver. We have evaluated our analysis on a number of non-trivial grammars and found inconsistencies.

Chalmers Research

Automatic conversion of colloquial Finnish to standard Finnish

Author: Listenmaa Inari
Tyers Francis M.
Publication date: 01/01/2015

This paper presents a rule-based method for converting between colloquial Finnish and standard Finnish. The method relies upon a small number of orthographical rules combined with a large language model of standard Finnish for ranking the possible conversions. Aside from this contribution, the paper also presents an evaluation corpus consisting of aligned sentences in colloquial Finnish, orthographically-standardised colloquial Finnish and standard Finnish. The method we present outperforms the baseline of simply treating colloquial Finnish as standard Finnish, but is outperformed by a phrase-based MT system trained on the evaluation corpus. The paper also presents preliminary results which show promise for using normalisation in the machine translation task.

Chalmers Research

Large-Scale Hybrid Interlingual Translation in GF: a Project Description.

Author: Angelov Krasimir
Kolachina Prasanth
Listenmaa Inari
Ranta Aarne
Publication date: 01/01/2014

Chalmers Research